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Abstract — Deconvolution is essential for radio interferometric 
imaging to produce scientific quality data because of finite 
sampling in the Fourier plane. Most deconvolution algorithms 
are based on CLEAN which uses a grid of image pixels, or clean 
components. A critical matter in this process is the selection of 
pixel size for optimal results in deconvolution. As a rule of thumb, 
the pixel size is chosen smaller than the resolution dictated by 
the interferometer. For images consisting of unresolved (or point 
like) sources, this approach yields optimal results. However, for 
sources that are not point like, in particular for partially resolved 
sources, the selection of right pixel size is still an open issue. 
In this paper, we investigate the limitations of pixelization in 
deconvolving extended sources. In particular, we pursue the usage 
of orthonormal basis functions to model extended sources yielding 
better results than by using clean components. 

Index Terms — Radio interferometry, Image deconvolution, Es- 
timation theory. 

I. Introduction 

Due to the incomplete Fourier plane sampling of radio 
synthesis observations, deconvolution is essential for making 
high fidelity images. The traditional CLEAN [1] algorithm 
and its variants are widely used for this deconvolution. There 
are several limitations of clean being applied to a typical 
image. First, the centroids of point sources will not exactly 
match pixel coordinates on a regular grid. Secondly, some 
sources might be extended, thus requiring more than one clean 
component. 

Therefore, since its invention, several studies on the perfor- 
mance of clean and its limitations have been conducted. In 
[2| for instance, the convergence and residual errors in terms 
of a least squares fit for the Fourier plane data is discussed. 
The work (3| (ch. 6) focuses mainly on clean components 
and their analogy to Fourier components of the sky image. 
The fundamental limitations of image pixelization (or having 
a regular grid of clean components) is studied in @), especially 
in the case where sources are located off a pixel center. 

In this paper, we focus on improving the deconvolution of 
bright, extended sources that are barely resolved. In order to 
arrive at our results, we use statistical estimation theory to 
derive some fundamental limits of clean in deconvolving such 
sources. Compared to the work of J4| which give numerical 
bounds, we derive analytical bounds on its performance. More- 
over, compared to [3) which takes a deterministic approach 
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to study clean component placement, we take a statistical 
approach to derive the Cramer-Rao lower bound 13, (6). 

The limitations of deconvolving an extended source with 
a set of clean components have been overcome by using 
clean components that have different scale sizes. For instance, 
0>[E) gives a comprehensive overview of this approach and 
comparisons with similar existing approaches. However, using 
multi scale pixels would still be limited by the resolution 
limit of the interferometer in case of barely resolved sources. 
In order to overcome this deficiency, we consider using a 
two dimensional orthonormal basis instead of a set of clean 
components. As a real example of this technique, we select 
one such basis called the shapelet basis J9) and apply this 
technique to Westerbork Synthesis Radio Telescope (WSRT) 
low frequency observations. Shapelets have been extensively 
used in astronomical image processing applications ll9l. lfT0l . 
including deconvolution of radio interferometric data ifTTl . In 
this paper, we use shapelets for high fidelity imaging when 
clean based algorithms perform poorly. 

II. Mathematical preliminaries 

In this section we derive some fundamental limitations of 
clean and try to explain the reason that an orthonormal basis 
could improve on using clean components to model extended 
structure. For simplicity, we first consider a one dimensional 
image and its corresponding Fourier plane (axis). The image 
axis is given by I and the corresponding visibility axis is u. The 
corresponding units are radians and wavelengths, respectively. 

A. Interferometric imaging 

Let us consider observing a point source at the origin, whose 
brightness is given by 5(1) (the Dirac delta function). The 
visibilities correspond to the Fourier transform of 8(1), which 
is 1. We only observe at a set of discrete points on the u 
axis, and this is equal to sampling by the weighted sampling 
function 



U(u) 



w(ui)S(u - Ui) 



(1) 



where w(iii) correspond to the weight we assign to the z-th 
sampling point, which is at itj on the u axis. For the remainder 
we consider all weights to be unity. The observed image /(/) 
is the inverse discrete Fourier transform 



1(1) = 5^(5ZW(u-«i))exp(j27rl« fc ) 
= y%xp(j27rln fc ) 



(2) 



which is the point spread function (PSF). In order to denote 
the nominal resolution limit, we use b = \/max(\u\). 

B. Pixelization error 

Now, let us consider a point source which is displaced by Iq 
from the origin, which has 70 brightness, which is represented 
by jqS(1 — Iq). The sampled visibility at the z-th point on the 
u axis is given by 



yi = 70 exp(-j2nloUi) + n t 



(3) 



where m is the observation noise. We assume the noise to 
be white, uncorrected complex (circular) Gaussian with zero 
mean and variance a 2 . 

Due to pixelization, we represent this point source with 
the pixel at the origin, if Iq is small enough. We estimate 
the magnitude a of the clean component at the origin by 
minimizing the least squared error. The error at the i-th 
sampling point will be 

Ci = 70 exp(-j27r? Wi) + rij - a (4) 
and the total (average) error to be minimized is 

e 2 = ^ E &&} = i Yl 7o 2 +^ 2 +a 2 -2a7o cos(27rfou<) 



(5) 

where N is the total number of sampling points on the u axis. 
The solution for a is obtained by solving = 



(6) 



and substituting (O to (0 gives the minimum error, £j = 
70 exp(— j2irlQUi) + n, — a. This error can be minimized by 
shifting the pixel grid by Iq, as shown in J4). Hence, this does 
not cause a real problem in deconvolution. 

C. Clean component placement at arbitrary locations 

We relax the pixelization requirement and assume we could 
place a clean component at any location. As before, we 
observe a point source, with magnitude 70, positioned at Iq 
on the image axis. The noisy observed visibilities are given 
by ©. 

Let y = [yi,y 2 , ■ ■ ■ ,2/jv] t and u = [m, u 2 , ■ ■ ■ , u N ] T 
represent the vectorized forms of the observed visibilities and 
the observation coordinates, respectively. We assume there 
are N observation points, the aforementioned vectors have 
dimensions JV x 1. The likelihood of the observation is given 
by 



p(y\o- ,/ ,7o,u) 
1 

= 7 ?\n CX P 

[7T(T ) 



(7) 



The variance of estimating Iq and 70 (or the Cramer-Rao 
lower bound) are given by 



Var(l ) > 



87o 2 ^ 2 E^ 2 



Var^o) > — (8) 



Note that similar results have been derived for a single 
interferometer J6), or a single point in u. We see from ([8]) 
that the error in estimation of Iq is not only dependent on the 
noise a 2 , but also dependent on the sampling points on the u 
axis, which is the resolution limit of the interferometer. 



D. Partially resolved sources 

The more challenging case in deconvolution is when the 
source cannot be represented as a pure point source, or as 
a single clean component. The simplest example for this is 
having two sources, with magnitudes 70 and 71, shifted by Iq 
and ii, respectively. The observed (dirty) image is 



1(1) 



70 exp(-j27rZ u l )+7i cxp(-j27WiUi) ) cxp(j2irlu l ) 

(9) 



The ability to correctly estimate the magnitudes and positions 
is dependent on the sampling on the u axis. For some cases, 
we will be unable to estimate them accurately, as we shall see 
later. Obviously, this happens when the two sources become 
closer together. In Figs. [TJ and |2j we have presented dirty 
images (and the corresponding visibility amplitudes) for barely 
resolved and unresolved cases, respectively. 





(a) 



(b) 



Fig. 1. Two point sources, almost unresolved case. The image (a) and the 
corresponding visibility amplitudes (b) are given. The magnitudes are 70 = 
1.0, 71 = 0.4. The positions are irj = 0.16, h = 0.86, where 6 is the 
resolution. Peaks are clearly seen at positions close to irj and ii. 





(a) 



(b) 



Fig. 2. Two point sources, unresolved case. The image (a) and the corre- 
sponding visibility amplitudes (b) are given. The magnitudes are 70 = 1.0, 
71 = 0.4. The positions lo = 0.16, l\ = 0.26 are too close to be resolved 
due to the finite resolution 6. 



As usual, the observed visibilities are given by 

y l = 70 cxp(-j27W Ui) + 71 cxp(-j27rZiu l ) + m (10) 



and the likelihood is 

p(y|o- 2 ,Z ,Zi,7o,7i,u) (11) 
= Jj^ayr exp (^T Yl ^ ~ ^° C M-J^M 

-71 exp(-j27r/iu !; )| 2 ^ 
The ML estimate is obtained by minimizing the cost J 
J = —5^2 \Vi\ 2 - Vi (70 exp(j2irl Ui) + 71 exp^u,) ) 

i ^ ' 

-Vi (lo exp(-j2irl Ui) + 71 cxp(~j2irl 1 u i )J 

+7o + 7? + 27o7i cos(27r(/ - h)u ) (12) 

with respect to IqJi^q, and 71. Let = [Iq, h,7o, 7i] T be the 
parameter vector to be estimated. Then the Fisher information 
matrix is given by 

™ = - E ^^J} (13) 

and the Cramer Rao bound is given by the diagonal entries of 
the inverse of T '(0): 

Var(!o) > [T- 1 {0)] 1Al Var{h) > [^(fl)]^, (14) 
Var(j Q ) > [^ 1 (0)] 3 ,3, Var(ffi) > [^(d)}^. 
In Fig. [3] we have given the CRLB for our example case. 




(a) (b) 

Fig. 3. Variation of the Cramer Rao lower bound with the spacing between 
two point sources. The variance in estimating the positions Iq,Ii are given 
in (a) and the variance in estimating the magnitudes 70 and 71 are given in 
(b). The true magnitudes are 70 = 1.0, 71 = 0.4 while the noise variance is 
<t 2 = 0.2 2 . It is clearly seen that as the two sources come closer than about 
0.46, the variance (or the estimation error) increases significantly. 

It is straightforward to extend the results derived in ( TBi i for 
a two dimensional visibility plane, where u and v are the vis- 
ibility axes. We again consider two sources, with magnitudes 
7o>7ii positioned at (lo,mo) and (h,mi) respectively. The 
sampled visibility at the i-th point on the uw-plane is given by 

Vi = lQ exp(- 32 * iloU ' +molH) ) + 7l exp(- j27r(il "' +mi ^ ) ) +n t 

(15) 

As shown in 0121 . we can derive bounds for the parameter 
set = [Iq, mo, h, mi, 70, 71]. We have given a numerical 
example in Fig. |4] for this case. As seen on Fig. |4] (b), as 
the two sources come closer, the variance in estimating their 
positions increase. 




(a) (b) 

Fig. 4. Cramer Rao lower bounds for a two dimensional case. The uv 
coverage is given in (a). In (b), the total variance in estimating a source 
position, Var(lo) + Var(mo) + Var(l\) + Var(m\) is given. The axes 
indicate the separation of the two sources, i.e. |?o — W\/b and \mo — m\\/b. 
The amplitudes are fixed at 70 = 1.0, 7 1 =0.4, while the noise cr = 0.2. 
The nominal resolution is b = 1 / max{\/ u 2 + v 2 ). 

As discussed in 0, any extended source could be repre- 
sented by clean components equivalent to the Fourier compo- 
nents of the brightness distribution of that source. However, 
the accuracy of this representation is limited when we have 
finite resolution due to lower bounds in estimation of positions 
and magnitudes of those clean components. Thus, it is futile to 
make the image grid arbitrarily small hoping the accuracy of 
our modeling of extended sources improve. Any pixel based 
deconvolution algorithm would run into this limitation and this 
forces us to find alternative methods to model such sources. 

E. Deconvolution using an arbitrary basis 

Because of the limitation of modeling an extended source 
by using multiple clean components, we strive to improve this 
by using other forms of image representation. Let us define 
an arbitrary basis as 

§ = {si(u),s 2 (u), . . . ,s K (u)} (16) 

where Sfc(u) is the fc-th basis function at u, on the visibility 
plane (axis). If we observe an extended source, the i-th 
visibility point can be given as 

Vi = ^2 0kSk(ui) + n 4 (17) 

A; 

where = [9\, O2, ■ • • , 0k] T are the K parameters we need to 
estimate. Representing the bases § evaluated at u by s(u) = 
[si(u), S2(u), . . . , sk(u)] t , we have the vectorized form 

y l =s T (u t )0 + n l (18) 

and combining all visibility points in vector y we have 

y = S0 + n (19) 

where S = [s(ito), s(ui), . . . , s(ujv)] t and 

n = [ni,n 2 , . . . ,n N ] T . 

This is the well studied linear statistical model and the 
likelihood can be expressed as 

My|^^) = ^^cx P (-l( y -s^( y -s^ 

(20) 



The ML estimate is = S'y where S' is the matrix pseudo 
inverse of S. From Q, we get the Cramer-Rao lower bound 

as 

Cov{6) = (S H {a 2 !)- 1 ^)- 1 = <j 2 {S h S)-\ (21) 

Using d2TV we get the variance of estimation of the fc-th 
parameter Ok as the fc-th diagonal entry of <r 2 (S H S)^ 1 . Note 
that this is minimized if S H S = I. In other words, if we 
choose the basis such that S is unitary, we get the lowest error. 
This is the primary motivation behind having an orthonormal 
basis instead of clean components. 

F. Information theoretic bounds 

An important question we should answer is the maximum 
number of basis functions or clean components that can be 
used to represent any given source. Following the arguments 
presented in |fl3l , we see that most sources have compact 
support both in the image plane and the Fourier plane. In the 
latter case, the support is also limited by the distribution of 
sampling points (baselines). Hence we can use Landau Pollak 
theorem |14) to limit the degrees of freedom of any source 
that can be seen from any interferometer. If the support area 
in the image plane is Ai m and the corresponding support in the 
Fourier plane is A uv , then the number of degrees of freedom 
is bounded by Ai m x A uv . This can be used as a criterion to 
limit the number of clean components (hence the pixel size) as 
well as the number of basis functions that can be effectively 
used to model any given source. 

III. Results 

As an example, we present results of an observation of 
Cygnus A, using the Westerbork Synthesis Radio Telescope. 
Around 150 MHz, the source Cygnus A is barely resolved by 
the WSRT Hence, traditional clean based algorithms fail to 
perform satisfactorily. In Fig. |5](a), we have given the results 
obtained using clean. In this case, the dynamic range (ratio 
of the peak flux to the noise in the image) is about 10,000. 
However, by using shapelet basis functions we can improve 
the result to reach a dynamic range of well over 500,000 as 
seen in Fig. [5] (b). 

IV. Conclusions 

We have presented limitations of pixel based deconvolution 
of extended sources in radio interferometry. We have both 
theoretically and based on real data, shown that by using 
suitable orthonormal basis functions, we could overcome this 
limitation. Although we have chosen shapelets as our example 
basis functions, future work should focus on finding better 
basis functions in terms of performance and in terms of 
computational efficiency. 
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